Searching the contents of PDFs in Google Drive
2021/5/15 Conclusion: details unknown
The maximum size for images (.jpg, .gif, .png) and PDF files (.pdf) is 2 MB. For PDF files, we only look at the first 10 pages when searching for text to extract.
As of 2021/5/15, the passage above no longer appears in the documentation.
OCR constraints (when converting a file to Google Docs text)
These tips will give you the best results:
Format: You can convert .JPEG, .PNG, .GIF, or PDF (multipage documents) files.
File size: The file should be 2 MB or less.
Resolution: Text should be at least 10 pixels high.
Orientation: Documents must be right-side up. If your image is facing the wrong way, rotate it before uploading it to Google Drive.
Languages: Google Drive will detect the language of the document.
Font and character set: For best results, use common fonts such as Arial or Times New Roman.
Image quality: Sharp images with even lighting and clear contrasts work best.
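The mechanically checkable tips above (format, file size, text height) can be sketched as a pre-upload check. This is only a sketch: `ocr_precheck` and the constant names are hypothetical, "2 MB" is assumed to mean 2 MiB, and `.jpg` is included because the older passage quoted earlier lists it.

```python
from pathlib import PurePath

# Limits taken from the quoted tips. All names here are hypothetical.
MAX_OCR_BYTES = 2 * 1024 * 1024  # "2 MB or less" (assumed to mean 2 MiB)
OCR_EXTENSIONS = {".jpeg", ".jpg", ".png", ".gif", ".pdf"}
MIN_TEXT_HEIGHT_PX = 10  # "Text should be at least 10 pixels high"


def ocr_precheck(filename, size_bytes, text_height_px=None):
    """Return a list of reasons the file may convert poorly.

    An empty list means none of the checked tips is violated. Orientation,
    font, and image quality are not checked here.
    """
    problems = []
    ext = PurePath(filename).suffix.lower()
    if ext not in OCR_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_OCR_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the 2 MB limit")
    # Text height is optional because it usually has to be measured by hand.
    if text_height_px is not None and text_height_px < MIN_TEXT_HEIGHT_PX:
        problems.append(f"text is {text_height_px}px high, under 10px")
    return problems


if __name__ == "__main__":
    print(ocr_precheck("scan.pdf", 1_500_000))
    print(ocr_precheck("scan.tiff", 5_000_000, text_height_px=8))
```

Passing these checks does not guarantee that the PDF becomes searchable; it only rules out the documented limits as the cause.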
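After uploading 200 pages of PDF, it can take anywhere from 5 minutes to 7 hours before they become searchable.
Whether a file has been indexed can be checked via the API. The indexing priority cannot be changed.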
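The note does not say which API call reveals indexing status. One possible approach, assumed here and not confirmed by the source, is to poll the Drive API v3 `files.list` endpoint with a `fullText contains '...'` query until the uploaded file appears in the results. In the sketch below, `wait_until_searchable` and its `search` parameter are hypothetical; `search` is any callable that runs the query and returns matching file IDs.

```python
import time


def build_fulltext_query(phrase):
    """Build a Drive API v3 search query for a phrase in file content.

    Backslashes and single quotes are escaped as the Drive query syntax requires.
    """
    escaped = phrase.replace("\\", "\\\\").replace("'", "\\'")
    return f"fullText contains '{escaped}'"


def wait_until_searchable(search, file_id, phrase, interval_s=60,
                          timeout_s=8 * 3600, sleep=time.sleep,
                          clock=time.monotonic):
    """Poll until file_id shows up in a full-text search for phrase.

    search(query) is a caller-supplied function returning an iterable of
    file IDs. Returns the elapsed seconds, or None if timeout_s passes first.
    Pick a phrase that occurs only in the PDF body, because fullText also
    matches file names and descriptions.
    """
    query = build_fulltext_query(phrase)
    start = clock()
    while True:
        if file_id in set(search(query)):
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

With `google-api-python-client`, `search` could wrap `service.files().list(q=query, fields="files(id)").execute()` and return the `id` of each entry in `files`. The default timeout of 8 hours is chosen only to cover the 7-hour delay observed above.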
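No OCR mistakes, converted to text, and the PDF cannot be searched
2022/12/16 addendum: I can no longer tell what the line above was meant to say.
Searching after opening the PDF in Google Drive does return hits.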
2019/12/31
Whenever the PDF is standard or low resolution (instead of high resolution) the search feature works as expected, both in Drive and within the PDF. But when the PDF is high-resolution, search only works within the PDF, and NOT in Drive.
Does this depend on file size?